Bad text recognition because language in "detect text from picture" is hard-coded to English (Tesseract OCR) #30280
Labels
area/web interface
Related to the Mastodon web interface
bug
Something isn't working
status/to triage
This issue needs to be triaged
Steps to reproduce the problem
When you attach a picture to a toot on Mastodon, you can generate its description with "detect text from picture". But if the text in the picture is in a language other than English, the recognition results are very poor.
Expected behaviour
Good recognition of the text in the picture
Actual behaviour
Poor recognition (unless the language is English)
Detailed description
This feature uses Tesseract. Tesseract works with dictionaries and gets by far the best results when the recognition language matches the language of the text in the picture. But in the source code, English is hard-coded as the language.
At the very least, the language should default to the interface language the user has set. Better still, it should be selectable.
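To illustrate the suggestion, here is a minimal sketch of how a client locale could be mapped to a Tesseract language string instead of always passing English. The function name `tesseractLangForLocale` and the mapping table are hypothetical and not taken from the Mastodon source; the table is incomplete and only covers a few languages. The values are standard Tesseract traineddata names, and `+` is Tesseract's syntax for combining several languages. Appending English as a secondary language is an assumption made here, not something the current code does.

```javascript
// Hypothetical mapping from UI locale codes to Tesseract traineddata names.
// Illustrative and incomplete; a real table would cover every supported locale.
const TESSERACT_LANGS = new Map([
  ['en', 'eng'],
  ['de', 'deu'],
  ['fr', 'fra'],
  ['es', 'spa'],
  ['it', 'ita'],
  ['pt', 'por'],
  ['nl', 'nld'],
  ['ru', 'rus'],
  ['ja', 'jpn'],
  ['ko', 'kor'],
  ['zh-cn', 'chi_sim'],
  ['zh-tw', 'chi_tra'],
]);

// English stays the fallback, matching the current hard-coded behaviour.
const FALLBACK = 'eng';

// Returns a Tesseract language string for a locale such as 'de' or 'pt-BR'.
function tesseractLangForLocale(locale) {
  if (typeof locale !== 'string' || locale.length === 0) return FALLBACK;

  // Normalise 'pt_BR' / 'pt-BR' to 'pt-br', then try the full tag before the base language.
  const normalized = locale.toLowerCase().replace('_', '-');
  const base = normalized.split('-')[0];
  const match = TESSERACT_LANGS.get(normalized) || TESSERACT_LANGS.get(base);

  if (!match) return FALLBACK;

  // Assumption: also load English so mixed-language screenshots still work.
  return match === FALLBACK ? match : `${match}+${FALLBACK}`;
}
```

With such a helper, the value passed to Tesseract could come from the user's interface language (for example `tesseractLangForLocale('de')` gives `'deu+eng'`), or from a language picker in the dialog if the language is made selectable.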
Mastodon instance
social.tchncs.de
Mastodon version
4.2.8
Browser name and version
Firefox 105.0.3
Operating system
Win 10
Technical details
The language settings are in the file:
focal_point_modal.jsx